This report assesses the following with regard to the provided bibliography, bibliography_metrics.rds:
Remarks:
- The group ID is in the JSON export, but not in the CSV export. The group ID makes it possible to jump directly to the reference in the online Zotero library.
Bibliography Setup
The bibliography is loaded and the DOIs, ISBNs and ISSNs are extracted. In a second step, the corresponding works are downloaded from OpenAlex (https://openalex.org).
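The two steps above can be sketched as follows. This is a minimal illustration, not the report's actual code: `normalise_doi` and `fetch_works` are hypothetical helper names, and the fetch step assumes that the openalexR package is installed and that network access is available, so it is only defined here and not called.

```r
# Hypothetical helper: bring raw DOI strings into one canonical form so that
# the same work always yields the same key.
normalise_doi <- function(x) {
  x <- trimws(tolower(x))
  # strip common prefixes such as "https://doi.org/" or "doi:"
  x <- sub("^(https?://(dx\\.)?doi\\.org/|doi:\\s*)", "", x)
  # anything that does not look like "10.<registrant>/<suffix>" becomes NA
  x[!grepl("^10\\.[0-9]{4,9}/\\S+$", x)] <- NA_character_
  x
}

# Hypothetical helper: retrieve the matching works from OpenAlex.
# Requires openalexR and network access; defined but not called here.
fetch_works <- function(dois) {
  dois <- unique(dois[!is.na(dois)])
  openalexR::oa_fetch(entity = "works", doi = dois)
}

raw <- c("https://doi.org/10.1000/XYZ123", "doi: 10.1000/xyz123", "not a doi", NA)
clean <- normalise_doi(raw)
print(clean)
```

Normalising before the lookup matters because the same DOI written with and without a resolver prefix, or in different letter case, would otherwise be treated as two different identifiers.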
One measure of a bibliography's cleanliness is the share of references that have a DOI. The following table gives an overview of the number of DOIs, ISBNs and ISSNs in the bibliography.
Entries with DOIs, ISBNs or ISSNs
To identify a reference, the most widely used identifier is the DOI. The following table shows the number of references with a DOI and the number of unique DOIs.
Treating duplicate ISBNs or ISSNs as duplicate entries in the library is not warranted because, for example, different chapters of a book can be separate entries in the library and therefore share the same ISBN.
DOIs: 3863 (73.8%) - 2 duplicates
ISBNs: 453 (8.7%) - 57 duplicates
ISSNs: 3301 (63.1%) - 2118 duplicates
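Figures of this kind can be derived with a few lines of base R. The sketch below is an assumption about how such counts might be computed, not the report's own implementation: `summarise_ids` is a hypothetical helper, and it assumes one identifier per entry with `NA` or an empty string where the identifier is missing.

```r
# Hypothetical helper: count entries that carry an identifier, their share of
# all entries, and how many of them repeat an identifier seen earlier.
summarise_ids <- function(ids) {
  present <- ids[!is.na(ids) & nzchar(ids)]
  data.frame(
    n_with_id    = length(present),
    percent      = round(100 * length(present) / length(ids), 1),
    n_duplicates = sum(duplicated(present))
  )
}

# Toy input: five entries, three with a DOI, one DOI occurring twice
dois <- c("10.1000/a", "10.1000/b", "10.1000/a", NA, "")
res <- summarise_ids(dois)
print(res)
```

Note that `duplicated()` flags only the second and later occurrences, so an identifier that appears twice counts as one duplicate.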
The following DOIs are duplicates in the bibliography. This table should be empty.
# duplicate_isbns <- paste0("https://isbnsearch.org/search?s=", bibliography$dois_bib[duplicated(bibliography$isbns)])
# duplicate_issns <- paste0("", bibliography$dois_bib[duplicated(bibliography$issns)])

# DOIs that occur more than once in the bibliography
duplicate_dois <- bibliography$dois_bib[duplicated(bibliography$dois_bib)]

data.frame(
  # rep() keeps the column lengths equal, so an empty result does not raise an error
  Type = rep("doi", length(duplicate_dois)),
  Identifier = sprintf('<a href="https://doi.org/%s" target="_blank">%s</a>', duplicate_dois, duplicate_dois)
) |>
  knitr::kable(caption = "Duplicate DOIs in the Bibliography", escape = FALSE)
# Join the Zotero and OpenAlex counts by year and publication type
data <- dplyr::full_join(
  x = data_bib,
  y = data_works,
  by = c("publication_year", "type")
)
rm(data_bib, data_works)

data |>
  dplyr::filter(publication_year >= 1950) |>
  ggplot() +
  scale_fill_viridis_d(option = "plasma") +
  # Zotero
  geom_line(aes(x = publication_year, y = count_cumsum / 10, colour = type), linetype = "solid") +
  # OpenAlex
  geom_line(aes(x = publication_year, y = count_oa_cumsum / 10, colour = type), linetype = "dashed") +
  scale_x_continuous(breaks = seq(1500, 2020, 10)) +
  scale_y_continuous(
    "Proportion of publications",
    # multiply by 10 to scale the secondary axis back to the original counts
    sec.axis = sec_axis(~ . * 10, name = "Cumulative number of references")
  ) +
  labs(title = "Publications over time", x = "Year", y = "Number of publications") +
  theme_minimal() +
  theme(axis.text.y.right = element_text(color = "red")) +
  theme(legend.position = "bottom") +
  guides(fill = guide_legend(title = "Legend"))
Warning: Removed 36 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 196 rows containing missing values or values outside the scale range
(`geom_line()`).
rm(data)
Access Status of References
Access status is checked using the works retrieved from OpenAlex and is therefore limited to works that are indexed there. At the moment, only references with a DOI are retrieved from OpenAlex.
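As an illustration of such a check, the access status of the retrieved works could be tabulated as below. This is a sketch under stated assumptions: the toy data frame `works` stands in for the OpenAlex result, and the column name `oa_status` and its values are assumed here rather than taken from the report.

```r
# Toy stand-in for the works retrieved from OpenAlex (hypothetical data)
works <- data.frame(
  doi = c("10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"),
  oa_status = c("gold", "closed", "green", "closed"),
  stringsAsFactors = FALSE
)

# Number of works per access status, keeping missing values visible
status_counts <- table(works$oa_status, useNA = "ifany")

# Share of each status in percent
status_share <- round(100 * prop.table(status_counts), 1)
print(status_share)
```

Because only works found on OpenAlex enter the table, the shares describe the retrieved subset and not the whole bibliography.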
This comparison is currently done using the DOIs in the target and the ILK bibliography. Entries without a DOI cannot be compared at the moment. A comparison could be achieved by text comparison of the titles, but this is not implemented yet.
Of the 3863 references in the target bibliography, 186 (4.8%) are also in the ILK bibliography (3091 references).
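A DOI-based comparison of this kind can be sketched in base R as follows. The helper name `doi_overlap` and the toy vectors are hypothetical; the sketch assumes that both bibliographies are available as character vectors of DOIs and that entries without a DOI are `NA`.

```r
# Hypothetical helper: find the DOIs shared by two bibliographies and express
# the overlap as a percentage of the target bibliography.
doi_overlap <- function(target, reference) {
  # drop missing values, ignore case and surrounding whitespace, deduplicate
  clean <- function(x) unique(tolower(trimws(x[!is.na(x)])))
  target <- clean(target)
  reference <- clean(reference)
  shared <- intersect(target, reference)
  list(
    shared = shared,
    n_shared = length(shared),
    percent_of_target = round(100 * length(shared) / length(target), 1)
  )
}

# Toy input: one of the three target DOIs also occurs in the ILK vector
target_dois <- c("10.1000/A", "10.1000/b", "10.1000/c", NA)
ilk_dois <- c("10.1000/a", "10.1000/z")
res <- doi_overlap(target_dois, ilk_dois)
print(res$percent_of_target)
```

Multiplying the ratio by 100 before rounding keeps the reported value a percentage and not a proportion.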